Overview

Dataset statistics

Number of variables: 39
Number of observations: 260601
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 77.5 MiB
Average record size in memory: 312.0 B
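
As a rough cross-check, the average record size should follow from the total size in memory divided by the number of observations. The sketch below assumes 1 MiB = 1,048,576 bytes and uses the rounded figures reported above, so it only approximately reproduces the reported 312.0 B.

```python
# Figures copied from the dataset statistics above.
TOTAL_SIZE_MIB = 77.5
N_OBSERVATIONS = 260601
BYTES_PER_MIB = 1024 * 1024  # assumption: binary mebibytes

# Average bytes per row = total bytes / number of rows.
avg_record_bytes = TOTAL_SIZE_MIB * BYTES_PER_MIB / N_OBSERVATIONS
print(f"average record size: {avg_record_bytes:.1f} B")
```

The small gap to 312.0 B is expected, because 77.5 MiB is itself a rounded value.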

Variable types

BOOL: 22
NUM: 9
CAT: 8

Warnings

building_id has unique values [Unique]
geo_level_1_id has 4011 (1.5%) zeros [Zeros]
age has 26041 (10.0%) zeros [Zeros]
count_families has 20862 (8.0%) zeros [Zeros]
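
The zero percentages in these warnings can be re-derived from the raw counts and the number of observations. This is a minimal sketch using only the figures reported in this document; `zero_share` is a hypothetical helper name, not part of pandas-profiling.

```python
# Counts copied from the warnings above; the total is the number of observations.
N_OBSERVATIONS = 260601
zero_counts = {"geo_level_1_id": 4011, "age": 26041, "count_families": 20862}


def zero_share(count, total=N_OBSERVATIONS):
    """Return the percentage of rows whose value is zero."""
    return 100.0 * count / total


for name, count in zero_counts.items():
    print(f"{name}: {zero_share(count):.1f}% zeros")
```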

Reproduction

Analysis started: 2020-09-28 10:01:30.894964
Analysis finished: 2020-09-28 10:02:33.746467
Duration: 1 minute and 2.85 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
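
The reported duration is the difference between the two timestamps above. A short sketch of that arithmetic with the standard library, with the timestamp format string being an assumption inferred from how the values are printed:

```python
from datetime import datetime

# Assumed format, inferred from the timestamps shown in this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2020-09-28 10:01:30.894964", FMT)
finished = datetime.strptime("2020-09-28 10:02:33.746467", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```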

Variables

building_id
Real number (ℝ≥0)

UNIQUE

Distinct: 260601
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 525675.4828
Minimum: 4
Maximum: 1052934
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 4
5-th percentile: 52114
Q1: 261190
median: 525757
Q3: 789762
95-th percentile: 1000724
Maximum: 1052934
Range: 1052930
Interquartile range (IQR): 528572

Descriptive statistics

Standard deviation: 304544.999
Coefficient of variation (CV): 0.5793403136
Kurtosis: -1.203878964
Mean: 525675.4828
Median Absolute Deviation (MAD): 264277
Skewness: 0.001882356737
Sum: 1.369915565e+11
Variance: 9.274765644e+10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                   Count   Frequency (%)
1052670                 1       < 0.1%
847304                  1       < 0.1%
368102                  1       < 0.1%
729986                  1       < 0.1%
900578                  1       < 0.1%
896480                  1       < 0.1%
808415                  1       < 0.1%
812505                  1       < 0.1%
290264                  1       < 0.1%
269782                  1       < 0.1%
Other values (260591)   260591  > 99.9%

Minimum 5 values

Value    Count  Frequency (%)
4        1      < 0.1%
8        1      < 0.1%
12       1      < 0.1%
16       1      < 0.1%
17       1      < 0.1%

Maximum 5 values

Value    Count  Frequency (%)
1052934  1      < 0.1%
1052931  1      < 0.1%
1052929  1      < 0.1%
1052926  1      < 0.1%
1052921  1      < 0.1%
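
Several of the statistics reported for building_id are derived from others: the range from the minimum and maximum, the IQR from Q1 and Q3, the coefficient of variation from the standard deviation and mean, and the variance from the standard deviation. The sketch below recomputes them from the rounded values printed in this report, so small floating-point differences from the reported figures are possible.

```python
# Values copied from the building_id statistics in this report.
minimum, maximum = 4, 1052934
q1, q3 = 261190, 789762
mean, std = 525675.4828, 304544.999

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared

print(value_range, iqr, round(cv, 6), f"{variance:.6e}")
```

The same relationships hold for the other numeric variables in this section.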
 

geo_level_1_id
Real number (ℝ≥0)

ZEROS

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.90035341
Minimum: 0
Maximum: 30
Zeros: 4011
Zeros (%): 1.5%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 7
median: 12
Q3: 21
95-th percentile: 27
Maximum: 30
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.033616625
Coefficient of variation (CV): 0.5779433361
Kurtosis: -1.213248785
Mean: 13.90035341
Median Absolute Deviation (MAD): 6
Skewness: 0.2725303548
Sum: 3622446
Variance: 64.53899608
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)

Common values

Value              Count  Frequency (%)
6                  24381  9.4%
26                 22615  8.7%
10                 22079  8.5%
17                 21813  8.4%
8                  19080  7.3%
7                  18994  7.3%
20                 17216  6.6%
21                 14889  5.7%
4                  14568  5.6%
27                 12532  4.8%
Other values (21)  72434  27.8%

Minimum 5 values

Value  Count  Frequency (%)
0      4011   1.5%
1      2701   1.0%
2      931    0.4%
3      7540   2.9%
4      14568  5.6%

Maximum 5 values

Value  Count  Frequency (%)
30     2686   1.0%
29     396    0.2%
28     265    0.1%
27     12532  4.8%
26     22615  8.7%
 

geo_level_2_id
Real number (ℝ≥0)

Distinct: 1414
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 701.0746851
Minimum: 0
Maximum: 1427
Zeros: 38
Zeros (%): < 0.1%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 69
Q1: 350
median: 702
Q3: 1050
95-th percentile: 1377
Maximum: 1427
Range: 1427
Interquartile range (IQR): 700

Descriptive statistics

Standard deviation: 412.7107336
Coefficient of variation (CV): 0.5886829782
Kurtosis: -1.188232475
Mean: 701.0746851
Median Absolute Deviation (MAD): 349
Skewness: 0.02895738139
Sum: 182700764
Variance: 170330.1496
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
39                   4038    1.5%
158                  2520    1.0%
181                  2080    0.8%
1387                 2040    0.8%
157                  1897    0.7%
363                  1760    0.7%
463                  1740    0.7%
673                  1704    0.7%
533                  1684    0.6%
883                  1626    0.6%
Other values (1404)  239512  91.9%

Minimum 5 values

Value  Count  Frequency (%)
0      38     < 0.1%
1      204    0.1%
3      77     < 0.1%
4      315    0.1%
5      25     < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
1427   6      < 0.1%
1426   286    0.1%
1425   466    0.2%
1424   7      < 0.1%
1423   3      < 0.1%
 

geo_level_3_id
Real number (ℝ≥0)

Distinct: 11595
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6257.876148
Minimum: 0
Maximum: 12567
Zeros: 2
Zeros (%): < 0.1%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 611
Q1: 3073
median: 6270
Q3: 9412
95-th percentile: 11927
Maximum: 12567
Range: 12567
Interquartile range (IQR): 6339

Descriptive statistics

Standard deviation: 3646.369645
Coefficient of variation (CV): 0.5826848532
Kurtosis: -1.213896506
Mean: 6257.876148
Median Absolute Deviation (MAD): 3171
Skewness: 0.0003935120899
Sum: 1630808782
Variance: 13296011.59
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
633                   651     0.2%
9133                  647     0.2%
621                   530     0.2%
11246                 470     0.2%
2005                  466     0.2%
11440                 455     0.2%
7723                  443     0.2%
9229                  381     0.1%
2452                  349     0.1%
12258                 312     0.1%
Other values (11585)  255897  98.2%

Minimum 5 values

Value  Count  Frequency (%)
0      2      < 0.1%
1      6      < 0.1%
3      9      < 0.1%
5      14     < 0.1%
6      21     < 0.1%

Maximum 5 values

Value  Count  Frequency (%)
12567  1      < 0.1%
12565  7      < 0.1%
12564  6      < 0.1%
12563  24     < 0.1%
12562  3      < 0.1%
 

count_floors_pre_eq
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.129723217
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7276645453
Coefficient of variation (CV): 0.3416709456
Kurtosis: 2.322597881
Mean: 2.129723217
Median Absolute Deviation (MAD): 0
Skewness: 0.8341129586
Sum: 555008
Variance: 0.5294956905
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Common values

Value  Count   Frequency (%)
2      156623  60.1%
3      55617   21.3%
1      40441   15.5%
4      5424    2.1%
5      2246    0.9%
6      209     0.1%
7      39      < 0.1%
9      1       < 0.1%
8      1       < 0.1%

Minimum 5 values

Value  Count   Frequency (%)
1      40441   15.5%
2      156623  60.1%
3      55617   21.3%
4      5424    2.1%
5      2246    0.9%

Maximum 5 values

Value  Count  Frequency (%)
9      1      < 0.1%
8      1      < 0.1%
7      39     < 0.1%
6      209    0.1%
5      2246   0.9%
 

age
Real number (ℝ≥0)

ZEROS

Distinct: 42
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.53502865
Minimum: 0
Maximum: 995
Zeros: 26041
Zeros (%): 10.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
median: 15
Q3: 30
95-th percentile: 60
Maximum: 995
Range: 995
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 73.56593652
Coefficient of variation (CV): 2.772408408
Kurtosis: 157.2482363
Mean: 26.53502865
Median Absolute Deviation (MAD): 10
Skewness: 12.19249422
Sum: 6915055
Variance: 5411.947016
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=42)

Common values

Value              Count  Frequency (%)
10                 38896  14.9%
15                 36010  13.8%
5                  33697  12.9%
20                 32182  12.3%
0                  26041  10.0%
25                 24366  9.3%
30                 18028  6.9%
35                 10710  4.1%
40                 10559  4.1%
50                 7257   2.8%
Other values (32)  22855  8.8%

Minimum 5 values

Value  Count  Frequency (%)
0      26041  10.0%
5      33697  12.9%
10     38896  14.9%
15     36010  13.8%
20     32182  12.3%

Maximum 5 values

Value  Count  Frequency (%)
995    1390   0.5%
200    106    < 0.1%
195    2      < 0.1%
190    3      < 0.1%
185    1      < 0.1%
 

area_percentage
Real number (ℝ≥0)

Distinct: 84
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.018050583
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
median: 7
Q3: 9
95-th percentile: 16
Maximum: 100
Range: 99
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.392230936
Coefficient of variation (CV): 0.5477928694
Kurtosis: 30.43825794
Mean: 8.018050583
Median Absolute Deviation (MAD): 2
Skewness: 3.526082314
Sum: 2089512
Variance: 19.29169259
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
6                  42013  16.1%
7                  36752  14.1%
5                  32724  12.6%
8                  28445  10.9%
9                  22199  8.5%
4                  19236  7.4%
10                 15613  6.0%
11                 13907  5.3%
3                  11837  4.5%
12                 7581   2.9%
Other values (74)  30294  11.6%

Minimum 5 values

Value  Count  Frequency (%)
1      90     < 0.1%
2      3181   1.2%
3      11837  4.5%
4      19236  7.4%
5      32724  12.6%

Maximum 5 values

Value  Count  Frequency (%)
100    1      < 0.1%
96     3      < 0.1%
90     1      < 0.1%
86     5      < 0.1%
85     4      < 0.1%
 

height_percentage
Real number (ℝ≥0)

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.434365179
Minimum: 2
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 4
median: 5
Q3: 6
95-th percentile: 9
Maximum: 32
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.918418221
Coefficient of variation (CV): 0.3530160667
Kurtosis: 14.31852616
Mean: 5.434365179
Median Absolute Deviation (MAD): 1
Skewness: 1.808261757
Sum: 1416201
Variance: 3.68032847
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=27)

Common values

Value              Count  Frequency (%)
5                  78513  30.1%
6                  46477  17.8%
4                  37763  14.5%
7                  35465  13.6%
3                  25957  10.0%
8                  13902  5.3%
2                  9305   3.6%
9                  5376   2.1%
10                 4492   1.7%
11                 917    0.4%
Other values (17)  2434   0.9%

Minimum 5 values

Value  Count  Frequency (%)
2      9305   3.6%
3      25957  10.0%
4      37763  14.5%
5      78513  30.1%
6      46477  17.8%

Maximum 5 values

Value  Count  Frequency (%)
32     75     < 0.1%
31     1      < 0.1%
28     2      < 0.1%
26     2      < 0.1%
25     3      < 0.1%
 
land_surface_condition
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
t      216757  83.2%
n      35528   13.6%
o      8316    3.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

foundation_type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
r      219196  84.1%
w      15118   5.8%
u      14260   5.5%
i      10579   4.1%
h      1448    0.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

roof_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
n      182842  70.2%
q      61576   23.6%
x      16183   6.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

ground_floor_type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
f      209619  80.4%
x      24877   9.5%
v      24593   9.4%
z      1004    0.4%
m      508     0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

other_floor_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
q      165282  63.4%
x      43448   16.7%
j      39843   15.3%
s      12028   4.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

position
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
s      202090  77.5%
t      42896   16.5%
j      13282   5.1%
o      2333    0.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

plan_configuration
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
d      250072  96.0%
q      5692    2.2%
u      3649    1.4%
s      346     0.1%
c      325     0.1%
a      252     0.1%
o      159     0.1%
m      46      < 0.1%
n      38      < 0.1%
f      22      < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

has_superstructure_adobe_mud
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      237500  91.1%
1      23101   8.9%

has_superstructure_mud_mortar_stone
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
1      198561  76.2%
0      62040   23.8%

has_superstructure_stone_flag
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      251654  96.6%
1      8947    3.4%

has_superstructure_cement_mortar_stone
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      255849  98.2%
1      4752    1.8%

has_superstructure_mud_mortar_brick
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      242840  93.2%
1      17761   6.8%

has_superstructure_cement_mortar_brick
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      240986  92.5%
1      19615   7.5%

has_superstructure_timber
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      194151  74.5%
1      66450   25.5%

has_superstructure_bamboo
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      238447  91.5%
1      22154   8.5%

has_superstructure_rc_non_engineered
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      249502  95.7%
1      11099   4.3%

has_superstructure_rc_engineered
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      256468  98.4%
1      4133    1.6%

has_superstructure_other
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      256696  98.5%
1      3905    1.5%

legal_ownership_status
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
v      250939  96.3%
a      5512    2.1%
w      2677    1.0%
r      1473    0.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

count_families
Real number (ℝ≥0)

ZEROS

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9839486418
Minimum: 0
Maximum: 9
Zeros: 20862
Zeros (%): 8.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 1
Q3: 1
95-th percentile: 2
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4183889779
Coefficient of variation (CV): 0.425214244
Kurtosis: 17.67094319
Mean: 0.9839486418
Median Absolute Deviation (MAD): 0
Skewness: 1.634757873
Sum: 256418
Variance: 0.1750493368
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Common values

Value  Count   Frequency (%)
1      226115  86.8%
0      20862   8.0%
2      11294   4.3%
3      1802    0.7%
4      389     0.1%
5      104     < 0.1%
6      22      < 0.1%
7      7       < 0.1%
9      4       < 0.1%
8      2       < 0.1%

Minimum 5 values

Value  Count   Frequency (%)
0      20862   8.0%
1      226115  86.8%
2      11294   4.3%
3      1802    0.7%
4      389     0.1%

Maximum 5 values

Value  Count  Frequency (%)
9      4      < 0.1%
8      2      < 0.1%
7      7      < 0.1%
6      22     < 0.1%
5      104    < 0.1%
 
has_secondary_use
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      231445  88.8%
1      29156   11.2%

has_secondary_use_agriculture
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      243824  93.6%
1      16777   6.4%

has_secondary_use_hotel
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      251838  96.6%
1      8763    3.4%

has_secondary_use_rental
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      258490  99.2%
1      2111    0.8%

has_secondary_use_institution
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260356  99.9%
1      245     0.1%

has_secondary_use_school
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260507  > 99.9%
1      94      < 0.1%

has_secondary_use_industry
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260322  99.9%
1      279     0.1%

has_secondary_use_health_post
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260552  > 99.9%
1      49      < 0.1%

has_secondary_use_gov_office
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260563  > 99.9%
1      38      < 0.1%

has_secondary_use_use_police
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      260578  > 99.9%
1      23      < 0.1%

has_secondary_use_other
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

Value  Count   Frequency (%)
0      259267  99.5%
1      1334    0.5%

Interactions

[Figure: 81 pairwise interaction plots; images not preserved]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
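
The calculation described above can be sketched in plain Python. This is an illustrative implementation of the textbook formula, not the code pandas-profiling runs; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample standard deviations (n - 1 denominator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (std_x * std_y)


# Illustrative data: an exactly linear relationship, increasing and decreasing.
r_up = pearson_r([1, 2, 3, 4], [10, 20, 30, 40])
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```

Because the same n - 1 factor appears in the covariance and in both standard deviations, using n instead would give the same r.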

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
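
A minimal sketch of that recipe: replace each value by its rank (ties receive the average of the ranks they span, a common convention assumed here), then apply the Pearson formula to the ranks. The helper names and sample data are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Illustrative data: y = x**3 is nonlinear but strictly increasing.
rho = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
```

On this cubic example ρ reaches 1, whereas Pearson's r on the raw values would fall short of 1 because the relationship is not linear.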

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
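
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that description; it is an assumption that this matches the report's numbers, since many libraries compute the tie-adjusted tau-b instead. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total pairs, with tied pairs counted as neither."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Illustrative data: two concordant pairs and one discordant pair out of three.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

This direct version is quadratic in the number of observations, which is fine for illustration but slow for a dataset of this size.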

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
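
A sketch of a bias-corrected Cramér's V computed from a contingency table, assuming the commonly cited form of Bergsma's correction (shrinking φ² and the effective row and column counts by terms of order 1/(n-1)). The function name and example tables are hypothetical, and the exact code used to produce this report may differ.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a table given as a list of rows of counts."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Assumed correction terms (Bergsma, 2013).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Illustrative tables: independent variables vs. perfectly associated variables.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
```

The uncorrected estimator would be `math.sqrt(phi2 / min(k - 1, r - 1))`; the correction mainly matters for small samples or tables with many categories.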

Missing values

[Figure: two missing-value plots; images not preserved]

Sample

First rows

building_id, geo_level_1_id, geo_level_2_id, geo_level_3_id, count_floors_pre_eq, age, area_percentage, height_percentage, land_surface_condition, foundation_type, roof_type, ground_floor_type, other_floor_type, position, plan_configuration, has_superstructure_adobe_mud, has_superstructure_mud_mortar_stone, has_superstructure_stone_flag, has_superstructure_cement_mortar_stone, has_superstructure_mud_mortar_brick, has_superstructure_cement_mortar_brick, has_superstructure_timber, has_superstructure_bamboo, has_superstructure_rc_non_engineered, has_superstructure_rc_engineered, has_superstructure_other, legal_ownership_status, count_families, has_secondary_use, has_secondary_use_agriculture, has_secondary_use_hotel, has_secondary_use_rental, has_secondary_use_institution, has_secondary_use_school, has_secondary_use_industry, has_secondary_use_health_post, has_secondary_use_gov_office, has_secondary_use_use_police, has_secondary_use_other
080290664871219823065trnfqtd11000000000v100000000000
1288308900281221087ornxqsd01000000000v100000000000
29494721363897321055trnfxtd01000000000v100000000000
3590882224181069421065trnfxsd01000011000v100000000000
420194411131148833089trnfxsd10000000000v100000000000
53330208558608921095trnfqsd01000000000v111000000000
672845194751206622534nrnxqsd01000000000v100000000000
747551520323122362086twqvxsu00000110000v100000000000
84411260757721921586trqfqsd01000010000v100000000000
99895002688699410134tinvjsd00000100000v100000000000

Last rows

building_id, geo_level_1_id, geo_level_2_id, geo_level_3_id, count_floors_pre_eq, age, area_percentage, height_percentage, land_surface_condition, foundation_type, roof_type, ground_floor_type, other_floor_type, position, plan_configuration, has_superstructure_adobe_mud, has_superstructure_mud_mortar_stone, has_superstructure_stone_flag, has_superstructure_cement_mortar_stone, has_superstructure_mud_mortar_brick, has_superstructure_cement_mortar_brick, has_superstructure_timber, has_superstructure_bamboo, has_superstructure_rc_non_engineered, has_superstructure_rc_engineered, has_superstructure_other, legal_ownership_status, count_families, has_secondary_use, has_secondary_use_agriculture, has_secondary_use_hotel, has_secondary_use_rental, has_secondary_use_institution, has_secondary_use_school, has_secondary_use_industry, has_secondary_use_health_post, has_secondary_use_gov_office, has_secondary_use_use_police, has_secondary_use_other
26059156080520368598012553nrnfjsd01000000000v111000000000
260592207683101382190322555trnfqsd01000010000v100000000000
2605932264218767861325135trnfqsd01000000000v111000000000
260594159555271811537601312trnfxjd00001000000v100000000000
2605958270128268471822085trnfqsd01000000000v100000000000
260596688636251335162115563nrnfjsq01000000000v100000000000
2605976694851771520602065trnfqsd01000000000v100000000000
2605986025121751816335567trqfqsd01000000000v100000000000
26059915140926391851210146trxvsjd00000100000v100000000000
260600747594219910131076nrnfqjd01000000000v300000000000